In \(k\)-fold cross-validation, we start by dividing the data randomly into \(k\) approximately equal groups or folds. The schematic here shows 5-fold cross-validation.
The model is fit on \(k-1\) of the folds and the remaining fold is used to evaluate the model. Let’s look at the first row in the schematic. Here the model would be fit on the data that are in folds 2, 3, 4, and 5. The model would be evaluated on the data in fold 1. If we are fitting a linear regression model, that means we would compute the RMSE for fold 1. In the second row, the model is fit on the data in folds 1, 3, 4, and 5 and the RMSE would be computed for the data in the second fold.
After this is done for all 5 folds, we average the performance statistic, such as RMSE for linear regression models, across the folds to obtain the overall performance. This average is sometimes called the CV error. Averaging the performance over \(k\) folds gives a better estimate of the true error than a single hold-out set does, and the spread of the fold-level statistics also lets us estimate its variability.
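The procedure above can be sketched in plain Python. This is an illustrative implementation, not code from the text: the function names `simple_linreg` and `kfold_cv_rmse` are my own, and the regression is a one-predictor least-squares fit so the example stays self-contained.

```python
import math
import random

def simple_linreg(xs, ys):
    # Fit y = a + b*x by ordinary least squares (closed form).
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b

def kfold_cv_rmse(xs, ys, k=5, seed=0):
    # 1. Randomly divide the data into k approximately equal folds.
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]

    rmses = []
    for held_out in folds:
        # 2. Fit the model on the other k-1 folds...
        train = [i for i in idx if i not in held_out]
        a, b = simple_linreg([xs[i] for i in train], [ys[i] for i in train])
        # ...and compute RMSE on the held-out fold.
        sq_errs = [(ys[i] - (a + b * xs[i])) ** 2 for i in held_out]
        rmses.append(math.sqrt(sum(sq_errs) / len(sq_errs)))

    # 3. Average the k fold-level RMSEs to get the CV error;
    #    the list of per-fold RMSEs shows its variability.
    return sum(rmses) / len(rmses), rmses
```

With `k=5`, each call fits the model five times, once per row of the schematic, and returns both the CV error and the five fold-level RMSEs.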